WHAT IS CLAIMED IS:

1. A method of identifying a sequence of a nucleic acid that is suitable for use
as a substrate surface immobilized normalization probe, said method comprising: 

(a) identifying a plurality of candidate probe sequences for a target
nucleic acid based on at least one selection criterion;

(b) empirically evaluating each of said candidate probe sequences under 
a plurality of different experimental sets to obtain a collection of empirical data 
values for each of said candidate nucleic acid probe sequences for each of said
plurality of different experimental sets;

(c) clustering said candidate probe sequences into one or more groups 
of candidate probe sequences based on each candidate probe sequence's 
collection of empirical data values, wherein each of said one or more groups 
exhibits substantially the same performance across said plurality of experimental
sets;

(d) evaluating any remaining non-clustering probes for candidate probe 
sequences that satisfy a signal intensity threshold and exhibit substantially no 
variation in signal under said plurality of different experimental sets to identify any 
candidate probe sequences of said plurality that are suitable for use as a substrate
surface immobilized normalization probe.

2. The method according to Claim 1, wherein said at least one selection
criterion employed in said identifying step (a) is chosen from: 

(i) proximity to the 3' end of said target nucleic acid's corresponding
mRNA transcript;

(ii) base composition; and 

(iii) lack of homology to other expressed sequences of said target nucleic 
acid's organism. 

3. The method according to Claim 2, wherein all three of said selection criteria
(i), (ii) and (iii) are employed in said identifying step (a).

4. The method according to Claim 3, wherein said identifying step (a) is further
characterized by employing parameters that minimize the number of identified 
candidate probe sequences that overlap with each other. 
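
By way of illustration only, the selection criteria recited in Claims 2 to 4 can be sketched as a simple window scan. The sketch below is not part of the claimed method: the function names, the probe length, the 3' distance cutoff and the GC range are all hypothetical values chosen for the example, and the homology screen of criterion (iii) is omitted because it would require a sequence database search.

```python
def gc_fraction(seq):
    """Fraction of G and C bases in seq (a simple base-composition measure)."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def candidate_probes(transcript, probe_len=25, max_dist_3prime=100,
                     gc_range=(0.4, 0.6)):
    """Return (start, sequence) pairs for non-overlapping candidate windows.

    All thresholds are illustrative assumptions, not values from the claims.
    Criterion (iii), lack of homology, is deliberately not implemented here.
    """
    n = len(transcript)
    chosen = []
    # Scan from the 3' end toward the 5' end so that windows closest to
    # the 3' end are considered first (criterion (i)).
    start = n - probe_len
    while start >= 0 and (n - (start + probe_len)) <= max_dist_3prime:
        window = transcript[start:start + probe_len]
        if gc_range[0] <= gc_fraction(window) <= gc_range[1]:
            # Criterion (ii): base composition is within the allowed range.
            chosen.append((start, window))
            # Jump a full probe length so accepted candidates never overlap.
            start -= probe_len
        else:
            start -= 1
    return chosen
```

Jumping a full probe length after each accepted window is one simple way to keep candidates from overlapping; other spacing rules would serve the same purpose.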

5. The method according to Claim 1, wherein said empirically evaluating step
(b) comprises for each member of said plurality of different experimental 
conditions: 

(i) providing an array of candidate nucleic acid probes immobilized on a 
surface of a solid support, wherein said array includes a substrate surface
immobilized nucleic acid candidate probe for each of said identified candidate
probe sequences; and 

(ii) subjecting said array to said member of said plurality of different 
experimental sets. 

6. The method according to Claim 5, wherein each member of said plurality of
different experimental conditions is a different tissue/cell line differential gene 
expression assay. 

7. The method according to Claim 1, wherein said clustering step (c) comprises:

(i) obtaining an expression vector for each of said candidate probe
sequences using said candidate sequence's collection of empirical data values;

(ii) deriving a similarity matrix for the set of said candidate probe 
sequences from said candidate probe sequences' expression vectors; and

(iii) grouping said candidate probe sequences based on their derived
similarity.

8. The method according to Claim 7, wherein those candidate probes that 
have substantially similar expression patterns are grouped together. 

9. The method according to Claim 1, wherein the clustering step employs an
affinity threshold or another stringency controlling parameter. 
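
By way of illustration only, the clustering recited in Claims 7 to 9 (expression vectors, a similarity matrix, and grouping under an affinity threshold) might be sketched as follows. The claims do not name a similarity measure or a grouping rule, so the Pearson correlation, the single-linkage grouping, the 0.9 threshold and all function names below are assumptions made for the example.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation of two equal-length vectors (0.0 if either is constant)."""
    n = len(u)
    mu, mv = sum(u) / n, sum(v) / n
    du = [x - mu for x in u]
    dv = [y - mv for y in v]
    denom = sqrt(sum(x * x for x in du) * sum(y * y for y in dv))
    if denom == 0:
        return 0.0
    return sum(x * y for x, y in zip(du, dv)) / denom


def similarity_matrix(vectors):
    """Pairwise similarity for a dict mapping probe name -> expression vector."""
    names = list(vectors)
    return names, [[pearson(vectors[a], vectors[b]) for b in names] for a in names]


def group_probes(vectors, affinity=0.9):
    """Group probes whose similarity meets the (assumed) affinity threshold.

    Returns (clusters, non_clustering): clusters are lists of two or more
    probe names; non_clustering lists the probes that joined no group.
    """
    names, sim = similarity_matrix(vectors)
    parent = list(range(len(names)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if sim[i][j] >= affinity:
                parent[find(i)] = find(j)  # single-linkage merge

    groups = {}
    for i, name in enumerate(names):
        groups.setdefault(find(i), []).append(name)
    clusters = [g for g in groups.values() if len(g) > 1]
    non_clustering = [g[0] for g in groups.values() if len(g) == 1]
    return clusters, non_clustering
```

In this sketch a probe whose signal is constant across the experimental sets has zero correlation with every other probe, so it stays among the non-clustering probes that step (d) of Claim 1 goes on to evaluate.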

10. The method according to Claim 1, wherein a candidate probe sequence is
considered to exhibit substantially no variation in signal under said plurality of
different experimental sets if its log ratio is not significantly different than zero
across said plurality of different experimental sets.

11. The method according to Claim 10, wherein said log ratio is between about
0.5 and -0.5.
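
By way of illustration only, the test of Claims 10 and 11, combined with the signal intensity threshold of step (d) of Claim 1, might be sketched as below. The claims do not state the logarithm base, the intensity threshold, or how the two signals of a ratio are obtained, so the base-10 logarithm, the threshold of 100.0, the (test, reference) pairing and the function name are assumptions. The fixed bound of 0.5 follows Claim 11 and is used here in place of a statistical significance test.

```python
from math import log


def is_normalization_candidate(signal_pairs, min_intensity=100.0,
                               max_abs_log_ratio=0.5, base=10):
    """Check one probe against an intensity floor and a log-ratio bound.

    signal_pairs holds one (test_signal, reference_signal) pair per
    experimental set. The base and min_intensity values are assumptions.
    """
    # Claim 12 requires at least two experimental sets.
    if len(signal_pairs) < 2:
        return False
    for test, ref in signal_pairs:
        # Non-positive signals have no defined log ratio.
        if test <= 0 or ref <= 0:
            return False
        # Signal intensity threshold of step (d).
        if min(test, ref) < min_intensity:
            return False
        # Log ratio must stay within about -0.5 to 0.5 (Claim 11).
        if abs(log(test / ref, base)) > max_abs_log_ratio:
            return False
    return True
```

A probe passes only if every experimental set satisfies both conditions, which matches the requirement that the signal show substantially no variation across the whole plurality of sets.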

12. The method according to Claim 1, wherein said plurality of different
experimental sets is at least 2. 

13. The method according to Claim 12, wherein if no non-clustering probes are
present after said clustering step (c), said evaluating step (d) is not performed. 

14. The method according to Claim 1, wherein at least some of said steps are
carried out by a computational analysis system. 

15. A computer-readable medium having recorded thereon a program that 
identifies a sequence of a nucleic acid that is suitable for use as a substrate 
surface immobilized normalization probe according to the method of Claim 1. 

16. A computational analysis system comprising a computer-readable medium
according to Claim 15. 

17. A method of producing a nucleic acid array, said method comprising: 
producing at least two different probe nucleic acids immobilized on a
surface of a solid support, wherein at least one of said at least two different probe
nucleic acids is a normalization probe that has a sequence of nucleotides identified
according to the method of Claim 1.

18. The method according to Claim 17, wherein said at least two different probe
nucleic acids are produced on said surface of said solid support by synthesizing
said probe nucleic acids on said surface.

19. The method according to Claim 17, wherein said at least two different probe
nucleic acids are produced on said surface of said solid support by depositing said 
at least two different probe nucleic acids onto said surface of said solid support. 

20. A nucleic acid array produced according to the method of Claim 17.

21. A method of detecting the presence of a nucleic acid analyte in a sample, 
said method comprising: 

(a) contacting a nucleic acid array according to Claim 20 having a
nucleic acid probe that specifically binds to said nucleic acid analyte with a sample
suspected of comprising said analyte under conditions sufficient for binding of said 
analyte to said nucleic acid ligand on said array to occur; and 

(b) detecting the presence of binding complexes on the surface of said 
array to detect the presence of said analyte in said sample. 

22. A method comprising transmitting a result of a reading of an array obtained 
according to the method of Claim 20 from a first location to a second location. 

23. The method according to Claim 22, wherein said second location is a
remote location.

24. A method comprising receiving a transmitted result of a reading of an array 
obtained according to the method of Claim 20.

25. A kit for identifying a sequence of a nucleic acid that is suitable for use as a
substrate surface immobilized normalization probe, said kit comprising: 

(a) an algorithm that identifies a sequence of a nucleic acid that is 
suitable for use as a substrate surface immobilized normalization probe according 
to the method according to Claim 1, wherein said algorithm is present on a
computer readable medium; and

(b) instructions for using said algorithm to identify said sequence of a 
nucleic acid that is suitable for use as a substrate surface immobilized 
normalization probe.






